The sound of crashing waves, the roar of fast-moving cars -- sound conveys important information about the objects in our surroundings. In this work, we show that ambient sounds can be used as a supervisory signal for learning visual models. To demonstrate this, we train a convolutional neural network to predict a statistical summary of the sound associated with a video frame. We show that, through this process, the network learns a representation that conveys information about objects and scenes. We evaluate this representation on several recognition tasks, finding that its performance is comparable to that of other state-of-the-art unsupervised learning methods. Finally, we show through visualizations that the network learns units that are selective to objects that are often associated with characteristic sounds. This paper extends an earlier conference paper, Owens et al. 2016, with additional experiments and discussion.